List Scheduling in the Presence of Branches: A Theoretical Evaluation
Abstract
The extraction of operation-level parallelism from sequential code has become an important problem in compiler research due to the proliferation of superscalar and VLIW architectures. This problem becomes especially hard for code containing a large number of conditional branches. In this paper we extend previous work on straight-line code scheduling by looking at branching task systems whose control flow graph is acyclic. First, we define an optimality measure based on the probability of the various execution paths. Then we apply a list scheduling algorithm to these systems and derive a worst-case performance guarantee for this method. Finally, we show that there are branching task systems for which this bound is almost tight.

Introduction

With the widespread use of microprocessors capable of executing multiple operations per cycle, extraction of fine-grain parallelism from sequential programs is regaining momentum. This concept dates back to the 1960s, where machines like the IBM [...] or the CDC [...] provided hardware mechanisms to exploit operation-level parallelism automatically. Due to the frequency of conditional jumps in system code, this purely hardware-based approach rarely exceeded speedup factors of two or three. In the early 1980s Fisher developed an innovative compilation technique called trace scheduling that went beyond the conditional jump barrier in its quest to extract parallelism. Fisher subsequently introduced an architectural paradigm termed VLIW which, by employing a trace scheduling compiler, was claimed to provide high performance at low cost.

Today all systems that boost performance by exploiting fine-grain parallelism combine multiple-functional-unit, single-thread-of-control machines with sophisticated compilers. Several new compilation algorithms, such as percolation scheduling or region scheduling, have generalized the ideas behind trace scheduling for non-numerical programs. However, for most of these techniques the actual motion of operations beyond conditional branches has been given priority over mechanisms for the selection of the operations to move. Trace scheduling is an exception, as operations from the execution path with highest probability are always chosen to be the subject of a transformation. But to date no theoretical performance evaluation has been presented for this or any other scheduling heuristic dealing with conditional branches.

This is in contrast with the large body of theoretical results known for scheduling problems in the absence of conditional operations. In general these problems are NP-hard. Frequently a classical heuristic called list scheduling is employed to guarantee close-to-optimum performance. There, operations are first ordered in a priority list. Instructions are then constructed in a top-down fashion by selecting operations from this priority list and moving them to the instruction. This procedure guarantees in general a final running time of at most $2 - 1/m$ times the optimum, where $m$ is the number of operations that can be executed concurrently. In this paper we show that a generalization of the list scheduling heuristic in the presence of branches limits the deviation from the optimum to a factor in $m$ and $\lceil \log m \rceil$ [exact expression lost in extraction].

The remainder of the paper is structured as follows. We first introduce branching task systems, which formalize the notion of programs containing conditionals. We then explain our machine model and define schedules containing branches. Next we define optimality, and then explain how list scheduling has been extended in the presence of branches and give its new performance guarantee. Finally, we give an example which shows that the established performance bound is almost tight.

Branching Task System

A conventional task system comprises a set of operations $O$ and a precedence relation $\prec$ on $O$. The operations must be executed so that the dependence constraints dictated by $\prec$ are respected in the final schedule. To formalize the notion of an acyclic program containing branches, we extend this definition by adding conditionals, that is, operations whose outcome determines the next set of operations to execute.

Definition (Branching Task System). A triple $T = (O, G, \prec)$ consisting of a set of operations $O$, a control flow graph $G$ and a dependence relation $\prec$ is called a branching task system if the following conditions are valid. $G$ is an acyclic, single-entry, single-exit digraph whose vertex set is $O$ plus an entry vertex and an exit vertex, such that no operation in $G$ has out-degree greater than 2. Operations with out-degree 2 are called conditionals. The entry vertex has out-degree one. A path from the entry to the exit is called an execution path of $T$. The set of all such paths is denoted $P(T)$. For any $op \in O$, $P_{op}(T)$ denotes the set of execution paths traversing $op$. For each execution path $P$, $\prec$ is a partial order on $P$ compatible with its linear ordering; that is, $op \prec op'$ only if $op$ precedes $op'$ in $P$.

An example of a branching task system $T$ is given in two figures. The first gives the low-level code generated for a procedure computing the roots of the polynomial $a x^2 + b x + c$ with $a \neq 0$. The second gives the precedence relation of the branching task. The relation is portrayed in the form of a dependence graph, where a solid edge from an operation $op$ to an operation $op'$ denotes $op \prec op'$. Note that output dependencies between operations on different execution paths (as, e.g., between op[...] and op[...]) are realized by introducing static dependencies from appropriate conditional branches to these operations. There are various other possibilities to address this problem, e.g., renaming. However, a discussion of these issues is beyond the scope of this paper.

The control flow can also be extracted from the dependence graph figure. Within each block, the control flow is defined by the numerical order of the operations, with the conditional operation being the last. Between blocks, the control flow is represented by dashed edges. The entry vertex is the single predecessor of op[...], while the exit vertex is the successor of operations op[...], op[...] and op[...].

Note that time is considered to be a discrete rather than a continuous entity.
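The definition of a branching task system can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the control flow graph is stored as a successor dictionary, the vertex labels "entry", "exit", "op1", "cj1" and so on are hypothetical, and the example graph is not the one from the paper's figures. It checks the structural conditions on $G$, identifies the conditionals, and enumerates $P(T)$ and $P_{op}(T)$.

```python
ENTRY, EXIT = "entry", "exit"  # hypothetical labels for the two distinguished vertices

# Hypothetical control flow graph: vertex -> list of successors.
cfg = {
    ENTRY: ["op1"],
    "op1": ["cj1"],
    "cj1": ["op2", "op3"],   # conditional: out-degree 2
    "op2": ["cj2"],
    "cj2": ["op4", "op5"],   # conditional: out-degree 2
    "op3": [EXIT],
    "op4": [EXIT],
    "op5": [EXIT],
    EXIT: [],
}


def is_valid_cfg(cfg):
    """Check: exit has no successors, entry has out-degree 1,
    no vertex has out-degree > 2, and no cycle is reachable from the entry."""
    if EXIT not in cfg or cfg[EXIT] or len(cfg.get(ENTRY, [])) != 1:
        return False
    if any(len(succs) > 2 for succs in cfg.values()):
        return False
    state = {}  # vertex -> "open" (on the DFS stack) or "done"

    def acyclic_from(v):
        if state.get(v) == "open":
            return False  # back edge, so there is a cycle
        if state.get(v) == "done":
            return True
        state[v] = "open"
        ok = all(acyclic_from(w) for w in cfg[v])
        state[v] = "done"
        return ok

    return acyclic_from(ENTRY)


def conditionals(cfg):
    """Operations with out-degree 2."""
    return {v for v, succs in cfg.items() if len(succs) == 2}


def execution_paths(cfg, v=ENTRY):
    """All paths from v to the exit, each as a list of vertices."""
    if v == EXIT:
        return [[EXIT]]
    return [[v] + rest for w in cfg[v] for rest in execution_paths(cfg, w)]


def paths_through(cfg, op):
    """Execution paths that traverse the operation op."""
    return [p for p in execution_paths(cfg) if op in p]
```

In this example the graph without its exit vertex is a tree, so the number of execution paths equals the number of operations whose only successor is the exit.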
Further, it is assumed that every operation requires a single unit of time to execute. The use of multi-cycle operations is more thoroughly discussed in [...]. In general, it can be stated that the bound derived in this paper is no longer valid when operations have arbitrary durations. In this case list schedules may yield arbitrarily poor performance.

For the sake of simplicity, we will also require that the control flow graph $G$, with its exit vertex removed, is a tree, although the presented results apply to arbitrary control flow graphs as well. For more general branching task systems it may be necessary to sacrifice space performance in order to obtain even a modest speedup. More specifically, a speedup as little as [...] may require exponential code size. Thus, for branching tasks whose control flow graph is not a tree, time and space performance can be antipodal.

[Fig.: Code to compute the roots of a degree-2 polynomial. The listing is procedure Poly_Roots, with incoming parameters a, b, c and outgoing parameters x, x, roots. It computes register values from b, a and c, then branches on two nested conditional jumps (cj): the outer else branch only sets roots, while the inner conditional chooses between a branch that computes one x value and a branch that calls sqrt and computes two. Operators, register indices and operation numbers are not recoverable from this copy.]

This phenomenon can be intuitively explained by considering the number of execution paths of a control flow graph. If the control flow graph is a tree, the overall number of execution paths is equal to the number of leaf operations in the graph. However, in an arbitrary control flow graph with $n$ operations there can be exponentially many execution paths in $n$ [exact expression lost in extraction].

Machine Model and Branching Schedules

Our machine model is capable of executing $m$ arbitrary operations per time unit. The set of operations executed in a given time instant is called an instruction. When an instruction $I$ contains $k \le m$ conditionals, these are arranged to form a decision tree with $k + 1$ outgoing branches that specifies which instruction must be executed next. This machine model is inspired by the branching paradigm of Karplus, Nicolau and Ebcioğlu. A formal definition is given below.

Definition (Branching Schedule). A branching schedule $\sigma$ of a branching task system comprises a set of instructions $\mathcal{I}$ and a control flow graph $G_\sigma$. An instruction is a set of operations. If every instruction contains at most $m$ operations, $\sigma$ is said to be an $m$-schedule. $G_\sigma$ is an acyclic, single-entry, single-exit digraph whose vertex set is $\mathcal{I}$ plus an entry vertex and an exit vertex. An instruction $I$ has out-degree $k + 1$ if and only if it contains $k$ conditionals. A path from the entry to the exit is called an execution path of $\sigma$. The set of all such execution paths is denoted $P(\sigma)$. The length $d(P)$ of an execution path $P \in P(\sigma)$ is the number of instructions traversed by $P$. As we have assumed that the control flow graph of a branching task is a tree, the control flow graph of a branching schedule will also be a tree.
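The branching schedule definition can be illustrated with a minimal sketch. Everything concrete here is an assumption made for the example: the instruction names I1 to I4, the operation names, the tuple layout, and the omission of an explicit entry vertex (the walk simply starts at the first instruction). The code checks the $m$-schedule condition, checks that an instruction with $k$ conditionals has $k + 1$ successors, and computes the length $d(P)$ of every execution path.

```python
EXIT = "exit"  # hypothetical label for the exit vertex

# Hypothetical schedule:
# instruction name -> (operations, conditionals among them, successor instructions)
schedule = {
    "I1": ({"op1", "op2", "cj1"}, ["cj1"], ["I2", "I3"]),
    "I2": ({"op3", "cj2"}, ["cj2"], ["I4", EXIT]),
    "I3": ({"op4"}, [], [EXIT]),
    "I4": ({"op5"}, [], [EXIT]),
}


def is_m_schedule(schedule, m):
    """True if every instruction contains at most m operations."""
    return all(len(ops) <= m for ops, _, _ in schedule.values())


def out_degrees_consistent(schedule):
    """True if each instruction with k conditionals has exactly k + 1 successors
    and its conditionals are among its operations."""
    return all(
        set(conds) <= ops and len(succs) == len(conds) + 1
        for ops, conds, succs in schedule.values()
    )


def path_lengths(schedule, start):
    """d(P) for every execution path P from start to the exit:
    the number of instructions traversed."""
    if start == EXIT:
        return [0]
    _, _, succs = schedule[start]
    return [1 + d for s in succs for d in path_lengths(schedule, s)]
```

With this example, `path_lengths(schedule, "I1")` yields one length per execution path, and the schedule is a 3-schedule but not a 2-schedule because I1 holds three operations.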
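For comparison, the classical list scheduling heuristic for branch-free code, as summarized in the introduction, can be sketched as follows. This is a minimal illustration assuming unit-time operations and a precedence relation given as a predecessor dictionary; the function name, the argument layout and the example operations are hypothetical, and this is not the branching generalization studied in the paper.

```python
def list_schedule(priority, preds, m):
    """Build instructions top-down from a priority list.

    priority: operations in priority order (highest first)
    preds:    operation -> set of operations that must execute earlier
    m:        maximum number of operations per instruction
    Returns a list of instructions, each a list of operations.
    """
    done = set()                 # operations placed in earlier instructions
    remaining = list(priority)
    instructions = []
    while remaining:
        # An operation is ready once all its predecessors were scheduled
        # in strictly earlier instructions (unit-time operations).
        ready = [op for op in remaining if preds.get(op, set()) <= done]
        if not ready:
            raise ValueError("precedence relation contains a cycle")
        instr = ready[:m]        # take the m highest-priority ready operations
        instructions.append(instr)
        done.update(instr)
        remaining = [op for op in remaining if op not in done]
    return instructions


# Hypothetical straight-line task system with five operations.
preds = {"c": {"a"}, "d": {"a", "b"}, "e": {"c", "d"}}
priority = ["a", "b", "c", "d", "e"]
result = list_schedule(priority, preds, m=2)
```

Because an instruction is left partly empty only when no further operation is ready, a schedule built this way never idles a slot that a ready operation could fill, which is the property behind the classical worst-case guarantee.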
Journal title:
- Theor. Comput. Sci.
Volume: 196
Publication date: 1996